ROUND ROBIN SCHEDULING FOR FAIR FLOW CONTROL
IN DATA COMMUNICATION NETWORKS


Ellen  L.  Hahne  and  Robert  G.  Gallager 


Massachusetts  Institute  of  Technology 
Cambridge,  Massachusetts  02139 


ABSTRACT 


Round robin link scheduling, in conjunction with conventional window flow control, can be
used  to  achieve  throughput  fairness  in  point-to-point  packet  networks  with  virtual  circuit 
routing. 


1. INTRODUCTION


Consider  a  data  communication  network  consisting  of  store-and-forward  nodes  joined  by 
point-to-point  links.  Each  user  session  is  assigned  a  fixed  path  (often  called  a  virtual  circuit) 
through  the  network,  and  data  for  the  session  are  sent  in  packets  along  this  path.  In  such  a 
network it is possible for the incoming traffic rate at a node to exceed the outgoing rate, causing
a data queue to build up at that node. This queue may eventually overflow the node's storage
space, or the delay of acknowledgments may cause transmitters to assume that data were lost.
These problems result in wasteful retransmissions that effectively reduce the capacity of the
network. Flow control procedures attempt to prevent or alleviate this degradation by
regulating the appropriate traffic sources. Reference 1 discusses many of the flow control
techniques that have been proposed in the literature.

One such scheme is the window method [1]. This technique limits the number of packets for
each session which have been transmitted but for which acknowledgments have not yet been
received. The maximum permissible number of outstanding packets is called the window size.
A single window may be applied to all of a session's traffic, or the session may have a separate
window for its traffic over each link. We mention the window method because it is a component
of several more elaborate strategies to be discussed later.
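
As an illustration, the window rule can be sketched in a few lines of Python. The class name WindowSender and its methods are hypothetical names introduced here; the sketch only tracks the count of outstanding packets for one session and does not model the network itself.

```python
class WindowSender:
    """Tracks outstanding (sent but unacknowledged) packets for one session."""

    def __init__(self, window_size):
        self.window_size = window_size  # maximum permissible outstanding packets
        self.outstanding = 0

    def can_send(self):
        # A new packet may be sent only while the window is not full.
        return self.outstanding < self.window_size

    def send(self):
        if not self.can_send():
            raise RuntimeError("window is full")
        self.outstanding += 1

    def acknowledge(self):
        if self.outstanding == 0:
            raise RuntimeError("no packet is outstanding")
        self.outstanding -= 1


sender = WindowSender(window_size=2)
sender.send()
sender.send()
blocked = not sender.can_send()   # the window is now full
sender.acknowledge()
reopened = sender.can_send()      # one acknowledgment reopens the window
```

With a window size of 2, the third packet has to wait until an acknowledgment arrives.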


It would be desirable for flow control procedures to regulate network inputs so as to grant
each  session  a  fair  throughput  rate.  As  explained  in  Reference  1,  many  proposed  flow  control 
methods  are  rather  unfair.  Several  studies  have,  however,  addressed  the  fairness  issue,  and  we 
will  briefly  discuss  these  now. 


The problem of achieving throughput fairness can be broken down into three parts. First,
the fairness objective must be formulated precisely. Then the fair session throughputs must be
determined. Finally, these rates must be enforced. The objective of Gallager and Golestaani [2]
is to minimize a sum of penalty functions, one for each link and one for each session. The link
functions penalize high link delays, while the session functions penalize low session throughputs.
This objective function expresses a trade-off between overall network efficiency and user
fairness. Another fairness criterion, called max-min flow, is used in various forms by Bially,
Gold, and Seneff [3], Jaffe [4], Hayden [5], Gafni and Bertsekas [6], and Mosely [7]. We will
define only the simplest version of this objective, which is Hayden's. To satisfy the max-min
flow criterion, the smallest session rate in the network must be as large as possible. Subject to
this constraint, the second-smallest session rate must be as large as possible, etc. Given a
network with its link capacities and a set of sessions with their routes, there is a unique set of
session rates that satisfies the max-min conditions. Section 2 explains the max-min flow
criterion in more detail. We will adopt this criterion as the definition of fairness for this paper.
The studies mentioned in this paragraph also develop distributed algorithms for computing
session rates that are fair according to the various criteria.

Once  the  desired  session  rates  are  computed,  there  are  several  ways  to  enforce  them. 
Hayden [5] and Mosely [7] simulate a session input control that produces packet lengths
proportional to the desired session rate. The time between packet admissions is approximately
constant. This control is particularly meaningful for packetized voice traffic: it represents the
output of a variable rate vocoder [3]. Bially, Gold, and Seneff [3] simulate a similar control.
Another possibility is to use fixed length packets, but to regulate the time between packet
admissions for each session. Mukherji [8] does this in a way that is less rigid than time-division
multiplexing and thus avoids the delay problems of TDM under light loads. A third approach,
studied by Gallager and Golestaani [2], uses window flow control and adjusts the sessions'
window sizes to achieve the desired rates.

In this paper we propose another strategy for achieving max-min fair session rates. Let each
link offer its packet transmission slots to its user sessions in round robin fashion. If a session is
offered  a  chance  to  use  a  link  slot  but  has  no  packets  ready,  then  that  same  slot  is  offered  to  the 
next  session,  and  perhaps  the  next,  etc.,  until  a  ready  session  is  found.  In  each  pass  of  a  link’s 
round  robin,  a  session  may  transmit  only  one  packet.  In  order  to  prevent  an  excessively  long 
queue  at  a  session’s  bottleneck  link,  window  flow  control  is  also  employed.  Under  certain 
simplifying assumptions, it can be shown that this strategy yields long-term average session
throughputs  that  are  max-min  fair.  This  and  related  results  are  covered  in  Section  4.  Section  3 
gives  the  assumptions  underlying  the  results.  The  most  noteworthy  assumptions  are  that  all 
sessions  have  heavy  demand  and  that  the  window  sizes  are  sufficiently  large. 

The  attraction  of  this  method  is  its  simplicity.  Note  that  the  fair  rates  are  never  explicitly 
computed,  as  they  are  for  other  fair  flow  control  schemes.  The  only  overhead  communication  is 
that required for the window acknowledgments. The window sizes do not need to be adjusted
as network conditions change. A potential practical problem with this method is that the
windows may need to be large in order to guarantee throughput fairness for some networks.
This point is discussed in Section 5. Section 6 summarizes and concludes the paper.

2. FAIRNESS CRITERION

This section describes the simplest version of the max-min flow criterion, which we will take
as our definition of throughput fairness. First, let us describe the network flow model in terms
of which the criterion is defined. The network consists of nodes joined by directed links. Two
nodes may be connected by any number of links in either or both directions. The links have
finite capacity. The topology of the network and the link capacities are given. A set of n one-way
communication sessions x_1, ..., x_n has been specified, and each session has been assigned
a path (i.e., a sequence of appropriately directed links) through the network. The goal is to
assign a feasible transmission rate r_j to each session x_j so as to treat sessions fairly. It is
assumed that the traffic for each session will form a continuous, steady flow at the assigned
rate.

Next, let us define some terms. An allocation R = (r_1, ..., r_n) specifies a non-negative
real rate r_j for each session x_j without violating the link capacities; that is, the sum of the rates
for all sessions sharing any particular link cannot exceed the link's capacity. The rate list of an
allocation R = (r_1, ..., r_n) is the nondecreasing permutation of R. Note that the elements
of a rate list are not necessarily distinct.

Now fairness can be defined. An allocation R satisfies the max-min flow criterion if no other
allocation has a rate list that is lexicographically greater than the rate list of R. In other words,
the smallest component rate of R is as large as possible and, subject to that constraint, the
second-smallest component rate of R is as large as possible, etc. Each of these nested
optimization problems can be formulated as a linear program [5], and it can be shown that there
exists a unique allocation that satisfies them all.
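
The lexicographic comparison of rate lists can be made concrete with a short Python sketch. The function names rate_list and is_fairer are hypothetical, and the two allocations compared are an assumed toy case (two sessions sharing one unit-capacity link), not an example from the references.

```python
from fractions import Fraction


def rate_list(allocation):
    """Return the nondecreasing permutation of an allocation."""
    return sorted(allocation)


def is_fairer(a, b):
    """True if the rate list of `a` is lexicographically greater than that of `b`."""
    # Python compares lists element by element, i.e. lexicographically.
    return rate_list(a) > rate_list(b)


# Two feasible allocations for two sessions sharing a single unit-capacity link.
even = (Fraction(1, 2), Fraction(1, 2))
skewed = (Fraction(3, 4), Fraction(1, 4))

even_is_fairer = is_fairer(even, skewed)
```

Here the even split wins because its smallest rate, 1/2, exceeds the smallest rate of the skewed split, 1/4.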

This max-min allocation will be called the fair allocation, because it can be shown to be the
only allocation with the following property: a session x_j cannot transmit above its assigned rate
r_j unless some session with assigned rate ≤ r_j transmits below its assigned rate.


Alternatively, the max-min flow criterion can be stated in terms of bottlenecks. Suppose
some allocation is given. A link l is called a bottleneck for a session x_j using l if the assigned
rate r_j of x_j is at least as large as the assigned rate of any other session using l, and if the entire
capacity of l is assigned to the sessions using it. It can be shown that an allocation satisfies the
max-min flow criterion if and only if every session has at least one bottleneck link.
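
The bottleneck characterization lends itself to a direct check. The Python sketch below assumes the allocation passed in is already feasible; the function names and the small two-link network (link "a" of capacity 1, link "b" of capacity 2) are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def has_bottleneck(session, rates, paths, capacity):
    """True if some link on the session's path is a bottleneck for it."""
    for link in paths[session]:
        users = [s for s in paths if link in paths[s]]
        saturated = sum(rates[s] for s in users) == capacity[link]
        largest = all(rates[session] >= rates[s] for s in users)
        if saturated and largest:
            return True
    return False


def is_max_min_fair(rates, paths, capacity):
    """Bottleneck test for a feasible allocation (feasibility is not checked here)."""
    return all(has_bottleneck(s, rates, paths, capacity) for s in paths)


paths = {"s1": ["a", "b"], "s2": ["a"], "s3": ["b"]}
capacity = {"a": Fraction(1), "b": Fraction(2)}

fair = {"s1": Fraction(1, 2), "s2": Fraction(1, 2), "s3": Fraction(3, 2)}
unfair = {"s1": Fraction(1, 4), "s2": Fraction(3, 4), "s3": Fraction(7, 4)}

fair_ok = is_max_min_fair(fair, paths, capacity)
unfair_ok = is_max_min_fair(unfair, paths, capacity)
```

In the second allocation both links are saturated, but session s1 has the smallest rate on each of them, so it has no bottleneck and the test fails.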

These concepts will now be illustrated for the system in Figure 1.
The network consists of links l_1, l_2, l_3, and l_4 in tandem. Each link has unit capacity. Sessions
x_1, x_2, and x_3 use only link l_1. Session x_4 uses all four links. Sessions x_5 and x_6 use only link l_2.
Session x_7 uses l_3 followed by l_4. Sessions x_8 and x_9 use only l_4. The max-min allocation is
(1/4, 1/4, 1/4, 1/4, 3/8, 3/8, 1/4, 1/4, 1/4). The rate list for this allocation is
(1/4, 1/4, 1/4, 1/4, 1/4, 1/4, 1/4, 3/8, 3/8). Sessions x_1, x_2, x_3, and x_4 have link l_1 as a
bottleneck. Sessions x_5 and x_6 have l_2 as a bottleneck. Sessions x_4, x_7, x_8, and x_9 have l_4 as a
bottleneck; note that x_4 is bottlenecked at both l_1 and l_4. Link l_3 is not a bottleneck for any
session, since it has unused capacity.
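
The allocation quoted for this example can be reproduced with a small centralized computation. The sketch below uses the standard progressive filling (water-filling) procedure, which is not the linear programming formulation of [5] nor one of the distributed algorithms cited in Section 1. The function name max_min_rates and the PATHS and CAPACITY tables are hypothetical; the paths encode our reading of the Figure 1 example, and every session is assumed to use at least one link.

```python
from fractions import Fraction


def max_min_rates(paths, capacity):
    """Progressive filling: repeatedly saturate the link with the smallest fair share."""
    rates = {}
    remaining = dict(capacity)
    active = set(paths)  # sessions whose rate is not yet fixed
    while active:
        share = {}
        for link, cap in remaining.items():
            n = sum(1 for s in active if link in paths[s])
            if n:
                share[link] = cap / n  # equal split of what is left on this link
        tight = min(share, key=share.get)
        level = share[tight]
        for s in [s for s in active if tight in paths[s]]:
            rates[s] = level
            active.discard(s)
            for link in paths[s]:
                remaining[link] -= level
    return rates


PATHS = {
    "x1": ["l1"], "x2": ["l1"], "x3": ["l1"],
    "x4": ["l1", "l2", "l3", "l4"],
    "x5": ["l2"], "x6": ["l2"],
    "x7": ["l3", "l4"],
    "x8": ["l4"], "x9": ["l4"],
}
CAPACITY = {link: Fraction(1) for link in ("l1", "l2", "l3", "l4")}

rates = max_min_rates(PATHS, CAPACITY)
```

For this network the procedure fixes x1 through x4 at 1/4 on the first link, then x7, x8, and x9 at 1/4 on the fourth link, and finally x5 and x6 at 3/8 on the second link.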

3. SYSTEM MODEL

This section presents the system model assumed in Section 4. This model is slightly less
general than that of Section 2 with regard to network topology and link capacities. However,
the main difference between the two models is that Section 2 assumes a continuous, steady
traffic flow for each session, whereas this section explicitly models packets, queues, windows,
and schedules.

In this detailed model, the network consists of store-and-forward nodes joined by point-to-point,
one-way communication links. If two nodes are connected by link(s) in one direction,
then  they  must  be  connected  by  at  least  one  link  in  the  reverse  direction  so  that  flow  control 
acknowledgments  can  be  returned.  Links  and  nodes  are  error-free  and  perfectly  reliable.  The 
storage  capacity  of  each  node  is  large  enough  that  overflow  is  impossible. 

All  links  have  the  same  capacity,  and  all  data  packets  have  the  same  length.  A  packet 
experiences  no  processing  delay  at  a  node,  other  than  a  possible  queuing  delay  as  it  waits  for 
transmission.  A  packet  experiences  no  propagation  delay  on  a  link.  The  packet  transmission 
slots of all links are synchronized. Thus the entire system operates with slotted time.

Each session consists of a one-way flow of data packets from some origin node to some
destination node. Several sessions may have the same origin and destination nodes. During the
time interval in which the system is analyzed, the set of sessions using the network is fixed.
Each session is assigned a path (i.e., a sequence of appropriately directed links) through the
network. It is assumed that a session always has packets waiting at its origin node and has
storage available at its destination node. Since the sessions and their routes are given and fixed,
the max-min fair session rates of Section 2 are well-defined and do not change over time. (Of
course, it is not clear at this point whether the average packet flows of the sessions will actually
match these ideal rates.) The bottleneck link(s) for each session are also well-defined in terms of
the max-min rates, as explained in Section 2.

Each  session  has  a  buffer  at  each  node  along  its  path  that  can  store  up  to  W  packets  waiting 
for  transmission.  Associated  with  each  buffer  are  W  logical  quantities  called  permits.  Each 
packet  waiting  in  a  buffer  must  hold  a  permit  for  that  particular  buffer.  Permits  for  a  buffer 
that  are  not  currently  held  by  packets  in  that  buffer  are  stored  at  the  node  immediately 
upstream. When a packet is transmitted from the n-th node of its path to the (n+1)-th node, it
relinquishes the permit it needed for waiting at node n, and it seizes a permit for waiting at
node n+1. (If no such permits are available at node n, the packet cannot be transmitted.)
During the same time slot in which the packet, carrying its new permit, is transmitted from
node n to node n+1, the relinquished permit is transmitted from node n back to node n-1
over some link in the reverse direction. This discipline guarantees that the buffers will never
overflow. It is known as link-by-link window flow control, with W as the window size. The link
capacity  consumed  by  the  overhead  communication  required  to  implement  permits  will  be 
ignored. 
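
The permit bookkeeping for one session can be sketched as follows. This is a minimal Python illustration under simplifying assumptions: the class PermitBuffer and the function forward are hypothetical names, only two adjacent buffers are modeled, and the relinquished permit is credited immediately rather than carried over a reverse link.

```python
class PermitBuffer:
    """Per-session buffer at one node, guarded by W permits."""

    def __init__(self, W):
        self.W = W
        self.queue = 0         # packets waiting here, each holding one permit
        self.free_permits = W  # unused permits, stored at the node immediately upstream


def forward(upstream, downstream):
    """Move one packet from `upstream` to `downstream` if a permit allows it."""
    if upstream.queue == 0 or downstream.free_permits == 0:
        return False
    upstream.queue -= 1
    upstream.free_permits += 1    # the relinquished permit returns one hop upstream
    downstream.free_permits -= 1  # the packet seizes a permit for the next buffer
    downstream.queue += 1
    return True


a = PermitBuffer(W=2)
b = PermitBuffer(W=1)
a.queue, a.free_permits = 2, 0    # two packets already waiting in buffer a

first = forward(a, b)    # succeeds: b has a free permit
second = forward(a, b)   # refused: b's only permit is now held by a packet
```

The invariant queue + free_permits = W holds for every buffer at all times, which is why a buffer can never hold more than W packets.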

Each link l has a round robin scheduler to decide which session will use the link during each
time slot. The scheduler at l consults a fixed data structure consisting of session identifiers
arranged in a directed ring. Each session using l appears exactly once on this ring. The
scheduler also maintains a variable called the ring position identifying the session that last sent
a packet over l. To allocate the current time slot, the scheduler searches the ring, starting with
the session immediately following the current ring position, until it finds the first session x that
has both packet(s) and permit(s) available. If there is such a session x, it transmits a packet
over l during the current slot, and the ring position is updated to x.
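
The scheduling rule for a single link can be sketched in Python as follows. The class RoundRobinScheduler, its method allocate_slot, and the is_ready callback (which stands for "has both a packet and a permit available") are hypothetical names; the initial ring position is an assumption, since the model treats it as a given initial condition.

```python
class RoundRobinScheduler:
    """Round robin ring for one link."""

    def __init__(self, ring):
        self.ring = list(ring)              # each session appears exactly once
        self.position = len(self.ring) - 1  # assumed start: last ring entry "sent last"

    def allocate_slot(self, is_ready):
        """Return the session that uses the current slot, or None if none is ready."""
        n = len(self.ring)
        # Search the whole ring, starting just after the current ring position.
        for step in range(1, n + 1):
            i = (self.position + step) % n
            if is_ready(self.ring[i]):
                self.position = i
                return self.ring[i]
        return None  # the slot goes unused and the ring position is unchanged


scheduler = RoundRobinScheduler(["a", "b", "c"])
ready = {"a", "c"}  # session b has nothing to send in this example
picks = [scheduler.allocate_slot(lambda s: s in ready) for _ in range(4)]
```

In this example session b is skipped on every pass, so the slots alternate between a and c.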

4. RESULTS

Consider  a  system  that  satisfies  the  assumptions  of  Section  3.  Suppose  that  the  following 
parameters  of  the  system  are  given:  the  network  topology,  the  set  of  sessions  using  the 
network, the sessions' paths, the window size W, and the round robin ring for each link.
Suppose that the following initial conditions are also given: the queue length for each session at
each link and the ring position of the scheduler at each link. Let L denote the number of links
in the network. Let H denote the maximum number of links in the path of any session in the
network. Let S denote the maximum number of sessions sharing any single link in the network.
Define A_1 and A_2 in terms of L, H, S, and W as follows:

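A_1 = [expression illegible in the source; independent of W]

A_2 = [expression illegible in the source; depends on W]

The results below hold for this system, provided that the window size is larger than A_1 packets.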

•  The long-term average throughput of each session exactly equals its max-min fair rate.

•  The number of packets transmitted for any session over any link during any time interval
is within A_2 packets of the max-min fair amount, regardless of the length of the time
interval.

•  There exists a time T such that the following statements are true after T:

+  The number of packets transmitted for any session over any link during any time
interval is within A_1 packets of the max-min fair amount, regardless of the length
of the time interval. Recall that A_1 is independent of the window size W.

+  Every session has at least one bottleneck link, called a pure bottleneck, where there
are always packets and permits waiting, i.e., where the session accepts every
chance offered to it by the round robin link scheduler. A link that is a pure
bottleneck for some session is a pure bottleneck for every session bottlenecked
there.

+  The range (i.e., maximum minus minimum) of the queue length for any given
session at any given link is at most A_1 packets.

+  The lengths of a session's queues are related to the locations of its bottleneck
links. Buffers that are "slightly" upstream of bottleneck links are sometimes full
and are never empty; buffers that are "slightly" downstream of bottleneck links
are sometimes empty and are never full. To make this claim precise, we define
buffer properties P_E, P_N, and P_F below.

P_E : The buffer is empty infinitely often and is never full.

P_N : The buffer is never empty and is never full.

P_F : The buffer is full infinitely often and is never empty.

We will now characterize the buffers of a given session with respect to properties
P_E, P_N, and P_F. All buffers upstream of the session's first bottleneck link satisfy
property P_F. All buffers downstream of the last bottleneck link satisfy P_E. The
set of buffers between two successive bottleneck links can be partitioned into three
(possibly empty) subsets, with the buffers in each subset being contiguous.
Buffers in the upstream subset satisfy P_E. Buffers in the downstream subset
satisfy P_F. The middle subset can contain at most one buffer, which must satisfy
P_N.

The claims above can be proved by induction, starting with those sessions having the
smallest max-min fair rate, then considering those sessions with the second-smallest fair rate,
etc. The proof is given in [9].
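
The first claim can be explored numerically. The Python sketch below is a simplified slotted-time simulation of the model of Section 3, applied to our reading of the Figure 1 network. It is not part of the proof in [9]. The function simulate, the tables PATHS and FAIR, the window size W = 8, the empty initial queues, the initial ring positions, and the warm-up length are all assumptions made for illustration; W = 8 is a small value chosen for this one network and is not claimed to satisfy the window condition stated above.

```python
def simulate(paths, W, slots, warmup):
    """Return delivered packets per slot for each session, measured after `warmup`."""
    links = sorted({l for p in paths.values() for l in p})
    ring = {l: [s for s in sorted(paths) if l in paths[s]] for l in links}
    pos = {l: len(ring[l]) - 1 for l in links}  # assumed initial ring positions
    # q[s][k]: packets of session s waiting to cross the k-th link of its path.
    # q[s][0] is unused because the origin always has packets waiting.
    q = {s: [0] * len(p) for s, p in paths.items()}
    delivered = {s: 0 for s in paths}

    def ready(s, k):
        has_packet = k == 0 or q[s][k] > 0
        # The last hop needs no permit: the destination always has storage.
        has_permit = k == len(paths[s]) - 1 or q[s][k + 1] < W
        return has_packet and has_permit

    for t in range(slots):
        moves = []
        for l in links:  # every link decides from the state at the start of the slot
            n = len(ring[l])
            for step in range(1, n + 1):
                i = (pos[l] + step) % n
                s = ring[l][i]
                k = paths[s].index(l)
                if ready(s, k):
                    moves.append((s, k))
                    pos[l] = i
                    break
        for s, k in moves:  # apply all transmissions of this slot together
            if k > 0:
                q[s][k] -= 1
            if k + 1 < len(paths[s]):
                q[s][k + 1] += 1
            elif t >= warmup:
                delivered[s] += 1
    return {s: delivered[s] / (slots - warmup) for s in paths}


PATHS = {
    "x1": ["l1"], "x2": ["l1"], "x3": ["l1"],
    "x4": ["l1", "l2", "l3", "l4"],
    "x5": ["l2"], "x6": ["l2"],
    "x7": ["l3", "l4"],
    "x8": ["l4"], "x9": ["l4"],
}
FAIR = {"x1": 0.25, "x2": 0.25, "x3": 0.25, "x4": 0.25, "x5": 0.375,
        "x6": 0.375, "x7": 0.25, "x8": 0.25, "x9": 0.25}

rates = simulate(PATHS, W=8, slots=5000, warmup=1000)
for s in sorted(rates):
    print(s, round(rates[s], 3), "fair:", FAIR[s])
```

The measured throughputs can then be compared with the max-min rates of Section 2. Agreement on this one small network illustrates the claim but does not establish it for other topologies or window sizes.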

5. REMARKS ON THE WINDOW SIZE

Recall that in Section 4 the window size is assumed to be larger than A_1, an expression
involving H and S. For all but the simplest networks, this quantity is impractically large.
Large windows could result
in substantial storage requirements, high cross-network delay, very bursty session flows, and
slow convergence of the average session throughputs to the max-min fair rates. An important
question is whether a large window is actually necessary to guarantee max-min fair session
throughputs. The answer is complicated, because the exact window size needed for perfect
fairness in a particular system depends strongly on the network topology, the set of sessions and
their routes, the order of the sessions in the round robin rings, and the initial queue lengths
throughout the network. Unfortunately, examples have been discovered for which a very large
window is, in fact, necessary to exactly achieve the max-min fair rates. It is not known how
"common" such examples are. The natural next question is this: for a given small window size,
how unfair can the session throughputs be? We are currently investigating this issue.

6. CONCLUSION

Round  robin  link  scheduling,  in  conjunction  with  conventional  window  flow  control,  can  be 
used to achieve throughput fairness in point-to-point packet networks with virtual circuit
routing.  Assuming  heavy  demand  and  large  flow  control  windows,  it  can  be  proved  that  the 
long-term  average  throughputs  of  the  sessions  are  fair,  in  the  sense  of  maximizing  the  minimum 
session  rate.  The  round-robin  method  is  considerably  simpler  than  some  other  strategies  for 
throughput  fairness.  The  performance  of  round  robin  scheduling  with  smaller  windows  and 
lesser,  more  random  demand  deserves  further  study. 